|
A rolling hash is a hash function where the input is hashed in a window that moves through the input. A few hash functions allow a rolling hash to be computed very quickly—the new hash value is rapidly calculated given only the old hash value, the old value removed from the window, and the new value added to the window—similar to the way a moving average function can be computed much more quickly than other low-pass filters. One of the main applications is the Rabin-Karp string search algorithm, which uses the rolling hash described below. Another popular application is rsync program which uses a checksum based on Mark Adler's adler-32 as its rolling hash. Another application is the Low Bandwidth Network Filesystem (LBFS), which uses a Rabin fingerprint as its rolling hash. At best, rolling hash values are pairwise independent〔Daniel Lemire, Owen Kaser: Recursive n-gram hashing is pairwise independent, at best, Computer Speech & Language 24 (4), pages 698-710, 2010. arXiv:0705.4676〕 or strongly universal. They cannot be 3-wise independent, for example. == Rabin-Karp rolling hash == The Rabin-Karp string search algorithm is normally used with a very simple rolling hash function that only uses multiplications and additions: where is a constant and are the input characters. In order to avoid manipulating huge values, all math is done modulo . The choice of and is critical to get good hashing; see linear congruential generator for more discussion. Removing and adding characters simply involves adding or subtracting the first or last term. Shifting all characters by one position to the left requires multiplying the entire sum by . Shifting all characters by one position to the right requires dividing the entire sum by . Note that in modulo arithmetic, can be chosen to have a multiplicative inverse by which can be multiplied to get the result of the division without actually performing a division. 抄文引用元・出典: フリー百科事典『 ウィキペディア(Wikipedia)』 ■ウィキペディアで「rolling hash」の詳細全文を読む スポンサード リンク
|